Communication system, user equipment and method of performing a conference call thereof

ABSTRACT

This invention relates to a communication system (200), user equipment (201, 211) and method of performing a conference call. The communication system (200) employs discontinuous transmission and has a plurality of user equipment (201, 211, 213, 215) and a fixed network (205, 207, 208). It further comprises a speech metric processor (225) for determining if speech is originating at the user equipment. The system also comprises a silence descriptor frame generator (229) which generates silence descriptor frames if no speech is originating and a conference call processor (231) for determining if a call involving the user equipment is a conference call. Furthermore the communication system comprises means (229, 239) for suppressing silence descriptor information of the silence descriptor frames if the call is a conference call. The invention is applicable to cellular mobile radio communication systems such as UMTS.

FIELD OF THE INVENTION

This invention relates to a communication system, user equipment and method of performing a conference call therefor.

BACKGROUND OF THE INVENTION

In a cellular communication system, each of the subscriber units (typically mobile stations) communicates with typically a fixed base station. Communication from the subscriber unit to the base station is known as uplink and communication from the base station to the subscriber unit is known as downlink. The total coverage area of the system is divided into a number of separate cells, each predominantly covered by a single base station. The cells are typically geographically distinct with an overlapping coverage area with neighbouring cells. FIG. 1 illustrates a cellular communication system 100. In the system, a base station 101 communicates with a number of subscriber units 103 over radio channels 105. In the cellular system, the base station 101 covers users within a certain geographical area 107, whereas other base stations 113, 115 cover other geographical areas 109, 111. Some overlap areas 117 can be covered by more than one cell.

As a subscriber unit moves from the coverage area of one cell to the coverage area of another cell, the communication link will change from being between the subscriber unit and the base station of the first cell, to being between the subscriber unit and the base station of the second cell. This is known as a handover. Specifically, some cells may lie completely within the coverage of other larger cells.

All base stations are interconnected by a network. This network comprises communication lines, switches, interfaces to other communication networks and various controllers required for operating the network. The base stations themselves can also be considered part of the network. A call from a subscriber unit is routed through the network to the destination specific for this call. If the call is between two subscriber units of the same communication system the call will be routed through the network to the base station of the cell in which the other subscriber unit currently is. A connection is thus established between the two serving cells through the network. Alternatively, if the call is between a subscriber unit and a telephone connected to the Public Switched Telephone Network (PSTN) the call is routed from the serving base station to the interface between the cellular mobile communication system and the PSTN. It is then routed from the interface to the telephone by the PSTN.

Known cellular communication systems such as GSM use discontinuous transmission, whereby the transmissions from and to a base station are reduced when there is no voice activity on the link, for example during pauses in natural speech or when the other party is speaking. This significantly reduces the total power transmitted and thus reduces battery power drain and the interference caused to other subscriber units.

Conference calls are widely used in conventional landline telephony and are also becoming an increasingly popular way of conducting meetings using mobile telephony.

In a conference call, parties to the call may be calling from many different environments, some of which may have a high background noise level (a particular case is a mobile call from a moving car).

Discontinuous transmission (known as DTX for GSM) is implemented in GSM by first detecting that speech is not being transmitted and then transmitting a silence descriptor frame at intervals of about 9 speech frames. Silence descriptor frames are widely used in cellular communication systems employing discontinuous transmission and contain information related to the background noise in the absence of voice activity. This enables the voice decoder to generate background noise corresponding to the background noise of the originating subscriber unit. This is very useful in two-way communication, as the other party, by hearing the background noise of the remote subscriber unit, is aware that the link is still established and that the call has not been dropped. However, although DTX reduces the power transmitted, it also performs the function of simulating a high level of background noise at the far end of the link. This is disadvantageous in a conference call as the total background noise from all participants may significantly reduce the perceived quality. Thus, within a conference call DTX does not help the perceived quality of the multi-party call.

Conference bridges typically operate a function to mute to some extent calling parties that are not active on the basis of sound volume. Typically, such a conference bridge is operated as part of the PSTN and as the sound volume of the background noise (especially from a mobile station) may be quite significant, the background noise may not be muted and the active speech signal generated may be disturbed by the background noise from the non-active participants.

Hence, there is a need for an improved method and system for performing conference calls in a communication system employing discontinuous transmission.

SUMMARY OF THE INVENTION

The inventors of the current invention have realised that conventional approaches for performing a conference call in a communication system employing discontinuous transmission are suboptimal and can be improved. The invention seeks to provide an improved method and system for performing conference calls in a communication system employing discontinuous transmission.

Accordingly there is provided a method of performing a conference call in a communication system employing discontinuous transmission and having a plurality of user equipment and a fixed network, the method comprising the steps of: determining if speech is originating at a first user equipment of said plurality of user equipment; generating silence descriptor frames if no speech is originating at the first user equipment; determining if a call involving the first user equipment is a conference call; and suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.

In contrast to a conventional call where omission of background noise is very disturbing and unpleasant to the user, the method thus significantly improves quality for conference calls by significantly reducing noise.

Preferably the suppression of the silence descriptor information is by setting the silence descriptor information of the silence descriptor frame substantially similar to that indicating silence, and the step of determining if speech is originating at a first user equipment is performed by comparing the characteristics of a signal originating at the first user equipment to characteristics typical of a speech signal.

According to one feature of the invention, the communication system is a cellular communication system wherein the user equipment communicates through radio signals with at least one base station of the network.

In accordance with a second aspect of the invention, there is provided a communication system employing discontinuous transmission and having a plurality of user equipment and a fixed network, the communication system comprising: means for determining if speech is originating at a first user equipment of said plurality of user equipment; means for generating silence descriptor frames if no speech is originating at the first user equipment; means for determining if a call involving the first user equipment is a conference call; and means for suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.

In accordance with a third aspect of the invention, there is provided a user equipment employing discontinuous transmission and comprising: means for determining if speech is originating at the user equipment; means for generating silence descriptor frames if no speech is originating at the user equipment; means for determining if a call involving the first user equipment is a conference call; and means for suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the present invention is described below, by way of example only, with reference to the Drawings, in which:

FIG. 1 is an illustration of a cellular communication system according to prior art;

FIG. 2 is an illustration of a communication system in accordance with an embodiment of the invention; and

FIG. 3 illustrates a flow chart of a method of performing a conference call in accordance with an embodiment of the invention.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

The following description focuses on an embodiment compliant with a cellular communication system, such as GSM or UMTS (Universal Mobile Telecommunication System), but it will be apparent that the invention is not limited to this application.

FIG. 2 illustrates a communication system 200 in accordance with an embodiment of the invention.

A transmitting user equipment (UE) 201 (typically a user terminal, subscriber unit, mobile station or any other suitable device) communicates over a radio link 203 with a base station 205. The base station 205 is connected to an interconnecting fixed network 207. The network 207 comprises base station controllers, master switch centres, radio network controllers, operations and maintenance centres and any other network components required for implementing a desired network configuration as is well known in the art. The network 207 is further connected to a second base station 208 which over a radio link 209 communicates with a receiving UE 211. FIG. 2 also shows further transmitting UEs 213 and 215 communicating with the network through base stations 217, 219.

For clarity the UEs have been shown as transmitting or receiving UEs but it is clear that typically a UE will comprise functionality for both transmitting and receiving and will be both a transmitting and receiving UE. Specifically, the UE may function as a transmitting UE when there is voice activity by the user and a receiving UE when there is no voice activity, or it may simultaneously function as both a transmitting and receiving UE.

The transmitting UE 201 is communicating with the base station 205 using discontinuous transmission. Discontinuous transmission includes any transmission protocol which modifies the transmission format depending on the data transmission requirements of the UE and in particular depending on the voice activity. In one form of discontinuous transmission the information data rate varies in response to the user activity of the UE. In the specific embodiment, the discontinuous transmission between the UE and the base station employs silence descriptor frames (SID frames). When the UE 201 detects voice activity it transmits voice data at a first rate. When no activity is detected it transmits SID frames at a much lower rate. In the specific GSM embodiment one SID frame is transmitted for every 9 speech frames and thus the effective data rate for SID frames is 9 times lower than for speech frames, resulting in less power being transmitted and thus in a lower power drain and reduced interference caused to other UEs.
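The rate reduction described above can be illustrated with a small Python sketch. It is a simplified model only: it assumes one frame interval per voice-activity decision and one SID frame at the start of every block of nine silent intervals, and it does not reproduce the actual GSM timing rules. The function name `frames_sent` and its parameters are hypothetical.

```python
def frames_sent(voice_activity, sid_interval=9):
    """Count transmitted frames for a sequence of per-interval voice-activity flags.

    Simplified model (an assumption, not the GSM scheduling rules): a speech
    frame is sent in every active interval, and during a silent run one SID
    frame is sent at the start of each block of `sid_interval` intervals.
    Returns a (speech_frames, sid_frames) tuple.
    """
    speech_frames = 0
    sid_frames = 0
    silent_run = 0  # number of consecutive silent intervals seen so far
    for active in voice_activity:
        if active:
            speech_frames += 1
            silent_run = 0
        else:
            if silent_run % sid_interval == 0:
                sid_frames += 1
            silent_run += 1
    return speech_frames, sid_frames
```

Under this model, 18 silent intervals cost two transmitted frames instead of 18, which is the source of the reduced power drain and interference.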

A frame in a communications system has a similar function to an envelope in a postal system; it provides a container to carry information. Typically a communication system will support a number of frames with different functions; for example, some may be designed to carry user data between users, whilst others may be designed to carry signalling information. Therefore, a frame comprises a variable number of elements that may include: a source and a destination address, a description of the contents of the frame (e.g. its type, length etc.), a data payload that may be user data or signalling data, and data used to detect and correct errors introduced by the channel.

In the context of this embodiment, we are concerned with the generation and transmission of speech frames and, when the speech codec determines that speech is not present, of silence descriptor frames.

The SID frames comprise silence descriptor information. This is information generated in response to the background noise detected by the UE. In GSM the silence descriptor information is not intended to allow accurate reproduction at the receiving UE but only to generate some background noise which is related to the background noise at the transmitting UE. In its simplest form the silence descriptor information may simply consist of an indication of the relative volume level of the background noise, and in more advanced embodiments it may comprise further information such as the frequency response of the noise. The nature of the information in the SID frame is determined by the voice coding-decoding algorithm (Codec) used in the communication system. For example, the GSM codec transmits a number of parameters that define the shape of a filter characteristic and a number of parameters that describe the excitation to that filter; the silence descriptor in this case will thus contain a filter parameter set and excitation that model the background noise in the vicinity of the transmitting user.

Further description of SID frames in a GSM environment can be found in “The GSM System of Mobile Communications” by Mouly and Pautet, Cell & Sys, 1992, ISBN 2-9507190-0-7.

The transmitting UE 201 comprises a microphone 221 for converting the voice of the user to an electrical signal. This signal is fed to a vocoder 223, which digitises the signal and codes it in accordance with a vocoding technique as is well known in the art. The vocoder is connected to a speech metric processor 225, which is operable to determine if the digitised signal from the vocoder corresponds to a speech signal or to a representation of background noise. The speech metric processor 225 is connected to the transmitter 227 for transmission of speech frames if voice activity is detected, or to a silence descriptor generator 229 for generation of a SID frame if no voice activity is detected. The silence descriptor generator 229 is connected to the transmitter 227 for transmission of the SID frame. A conference call processor 231, operable to detect if the UE is in a conference call, is connected to the silence descriptor generator 229.

FIG. 3 illustrates the flow chart of a method 300 of performing a conference call in the UE in accordance with this embodiment of the invention.

In step 301 the next frame is processed, meaning that the vocoder processes a segment of the speech signal and generates a vocoded data packet corresponding to that segment or interval. In GSM the vocoder processes 20 msec speech segments. In step 303 the generated speech frame is evaluated to determine if it corresponds to a speech signal or whether it corresponds to background noise. In the preferred embodiment step 303 is carried out in the speech metric processor 225 and operates by comparing the characteristics of the digitised signal in the transmitting UE to characteristics typical of a speech signal. The speech metric processor 225 may specifically examine the speech coder metrics to determine if a transmitted frame is probably speech or not. Such a function might be performed by determining the vector space that the vocoder metric spans with speech, then rejecting frames with metrics outside that boundary (this function might be enhanced by a user training her phone). Such a function might also be performed by a speech recognition function, especially given that such recognition functions already exist in phones today.
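One way to picture the boundary test mentioned above is the following Python sketch. It assumes, purely for illustration, that the region spanned by speech metrics is approximated by per-dimension minimum and maximum values learned from frames known to be speech; the description does not specify how the boundary is represented. The function names `learn_bounds` and `within_bounds` are hypothetical.

```python
def learn_bounds(speech_metrics):
    """Return per-dimension (min, max) bounds over metric vectors from known speech.

    `speech_metrics` is a non-empty sequence of equal-length numeric vectors,
    for example collected while a user trains the phone (an assumed procedure).
    """
    return [(min(column), max(column)) for column in zip(*speech_metrics)]


def within_bounds(metric, bounds):
    """Return True if every component of `metric` lies inside the learned bounds."""
    if len(metric) != len(bounds):
        raise ValueError("metric and bounds must have the same dimension")
    return all(low <= value <= high for value, (low, high) in zip(metric, bounds))
```

A frame whose metric vector falls outside the learned bounds would be treated as not corresponding to speech. A bounding box is only one possible approximation of the region; other representations could be substituted without changing the overall method.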

If the speech metric processor 225 determines that the frame corresponds to speech, the method continues in step 305 by the speech frame being fed to the transmitter 227 for transmission in accordance with the transmission protocol of the communication system. If it is determined that the frame does not correspond to speech this information is fed to the silence descriptor generator 229.

The conference call processor 231 executes step 307 and determines if the UE is taking part in a conference call or not. This determination may be based on, for example, information received from the network indicating that the call is a conference call, by the UE automatically determining so based on the call id (e.g. certain phone numbers may be pre-designated as conference call numbers) or by automatic detection in the UE of received speech from a plurality of voices. However, in the preferred embodiment it is simply done by the user indicating to the UE that the current call is a conference call, for example by pressing a button on the exterior of the UE. The information as to whether the call is a conference call is fed to the silence descriptor generator 229.

If the call is not a conference call the silence descriptor generator proceeds in step 309 by transmitting the SID frame in accordance with the transmission protocol of the communication system.

If the call is a conference call, the silence descriptor generator proceeds in step 311 by suppressing the silence descriptor information in the SID frame. This may for example be done by modifying the SID frame already generated or generating a new SID frame containing the suppressed information. In the preferred embodiment, the suppression is performed by setting the silence descriptor information of the SID frame to that which is equivalent to complete silence. In other words, a zero SID frame is generated either by generation of a new SID frame or by replacing the information content of an existing SID frame.

The method then continues in step 313 by the transmission of the suppressed SID frame using the standard transmission protocol for SID frames.

After transmission of a frame, the UE continues by processing the next segment or interval of the signal as long as the call continues.
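The per-frame decision of steps 303 to 313 can be summarised in a short Python sketch. It is a minimal illustration under stated assumptions: the `Frame` type, the function `next_frame_to_send` and the representation of "complete silence" as an all-zero payload are hypothetical, and the real encoding of silence depends on the codec in use.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Frame:
    kind: str                  # "speech" or "sid"
    payload: Tuple[int, ...]   # vocoder parameters or noise parameters


def next_frame_to_send(vocoded, is_speech, in_conference, noise_params):
    """Choose the frame to transmit for one vocoded segment.

    vocoded       -- vocoder output for the segment (step 301)
    is_speech     -- result of the speech metric evaluation (step 303)
    in_conference -- result of the conference call determination (step 307)
    noise_params  -- silence descriptor information for the segment
    """
    if is_speech:
        # Step 305: speech frames are transmitted unchanged.
        return Frame("speech", tuple(vocoded))
    if not in_conference:
        # Step 309: normal call, transmit the SID frame as generated.
        return Frame("sid", tuple(noise_params))
    # Steps 311 and 313: conference call, transmit a SID frame whose
    # descriptor information is assumed here to be all zeros for silence.
    return Frame("sid", (0,) * len(noise_params))
```

The SID frame is still sent in the conference case, so the standard transmission protocol for SID frames is left untouched; only its information content changes.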

The transmitted frames are received by the base station 205 and through the network 207 and other base station 208 transmitted to the receiving UE 211. The receiving UE may also receive transmissions on other channels (e.g. other timeslots for GSM) from other UEs 213, 215 taking part in the conference call. The receiving UE will combine these signals into a single audio signal. In its simplest form the UE processes all received signals independently and only combines them at the input to the analogue amplifier, which may be a summing amplifier.

In this embodiment, the receiving UE 211 comprises a receiver for receiving the radio signals and generating the underlying data. The receiver is connected to a frame processor 235 which derives and analyses the current data frame. The frame processor 235 is connected to a background sound generator 239 and a voice decoder 237. If the frame processor 235 detects that the received frame is a speech frame, it feeds the signal to the voice decoder 237 that converts the signal into a digital signal which is converted to an analogue signal by the Digital to Analog Converter 241. The output of the Digital to Analog Converter 241 is connected to an analogue amplifier which is connected to a loudspeaker 245 for generating an audio signal for the user of the receiving UE.

If the frame processor 235 detects that the received frame is a silence descriptor frame, it feeds the signal to the background sound generator 239, which generates a background noise signal that is fed to the Digital to Analog Converter 241 instead of the speech signal.

A very simple form of combining the signals of the different transmitting UEs is to process the frames received from the transmitting parties independently and in parallel and to add the resulting signals at the input to the analogue amplifier. In this embodiment the UE either employs a plurality of parallel components or the individual components are operable to independently process more than one signal. In this case the method of receiving as described in the previous paragraphs is repeated for each transmitting UE in the conference call.
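The combining step can be sketched in Python as a sample-by-sample sum. Note the assumptions: the description places the summation at the input to the analogue amplifier, whereas this sketch performs an equivalent sum on digital samples, and it assumes 16-bit signed samples with clipping. The function name `mix` is hypothetical.

```python
def mix(signals):
    """Sum equally long lists of integer samples, clipping to the 16-bit signed range.

    `signals` holds one decoded sample list per transmitting UE.
    """
    if not signals:
        return []
    length = len(signals[0])
    if any(len(signal) != length for signal in signals):
        raise ValueError("all signals must have the same number of samples")
    return [max(-32768, min(32767, sum(column))) for column in zip(*signals)]
```

Because the contributions simply add, any background noise regenerated for a non-active UE appears in the combined signal, which is why suppressing it at the source or on reception improves the result.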

It is clear that if no suppression of background noise is included the background noise from the different UEs will add up and worsen the speech quality. Even if signals of low volume are muted, a high level of background noise (as for example from a UE being used in a car) will still be allowed through, causing a deterioration in the voice quality. However, by employing the described embodiment of the invention in the transmitting UE the background noise of a non-active UE is suppressed, resulting in no contribution to the background noise at the receiving UE and thereby improving the perceived speech quality.

In other embodiments, the suppression of the silence descriptor information is not done by modifying the information content of the silence descriptor frame but by ignoring the silence descriptor frame completely. In one such embodiment, the transmitting UE 201 simply transmits the SID frames regardless of whether it is involved in a conference call or not. However, upon receiving the frames, the receiving UE determines if it is in a conference call and if so it simply suppresses the silence descriptor information by ignoring any SID frames received from any transmitting UE. In this embodiment the receiving UE may thus disconnect the background sound generator 239 when in a conference call and reconnect it when not in a conference call.
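The receiver-side variant can be sketched as follows in Python. This is an illustration only: the function `render_frame`, the callables passed to it and the frame length of 160 samples (20 msec at an assumed 8 kHz sampling rate) are assumptions, not details taken from the description.

```python
SAMPLES_PER_FRAME = 160  # assumption: 20 msec segment at 8 kHz sampling


def render_frame(kind, payload, in_conference, decode_speech, generate_noise):
    """Return the samples to play for one received frame.

    decode_speech and generate_noise stand in for the voice decoder and the
    background sound generator; both take a payload and return a sample list.
    """
    if kind == "speech":
        return decode_speech(payload)
    if kind != "sid":
        raise ValueError("unknown frame kind: %r" % (kind,))
    if in_conference:
        # SID frame ignored: the background sound generator is bypassed.
        return [0] * SAMPLES_PER_FRAME
    return generate_noise(payload)
```

With this arrangement the transmitting UE needs no modification, since the decision is taken entirely in the receiving UE.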

In a different embodiment, the SID frames are ignored in the network, for example by the SID frames not being routed to the destination.

In a different embodiment, the signals from the transmitting UEs are not combined in the receiving UEs but in a conference application operable to form a conference bridge. This conference application will typically be implemented in the network or in a separate component connected to the network. In this embodiment, the signals from all transmitting UEs are communicated to the conference application and the resulting conference signal is distributed to all receiving UEs. Note that typically all transmitting UEs will also be receiving UEs and in order to avoid an echo effect the signal from a UE is not sent back to the same UE.

In this embodiment the conference application will simply combine all speech frames but will ignore all SID frames unless no speech frames are received, in which case one or more SID frames may be used for generating background noise or may be forwarded to the receiving UEs.
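The selection rule of the conference application can be sketched in Python. This is a hedged illustration: the function `bridge_forward`, the `(kind, payload)` tuple representation and the choice to forward exactly one SID frame when nobody is speaking are assumptions made for the example. For simplicity the sketch returns the frames to be forwarded rather than a combined signal.

```python
def bridge_forward(frames):
    """Decide which frames a conference bridge forwards to each UE.

    `frames` maps a UE identifier to the (kind, payload) tuple received from
    that UE in the current interval, where kind is "speech" or "sid".
    Returns a dict mapping each UE identifier to the list of frames sent to it.
    """
    speech = {ue: frame for ue, frame in frames.items() if frame[0] == "speech"}
    forwarded = {}
    for receiver in frames:
        if speech:
            # Speech present: SID frames are ignored, and a UE never
            # receives its own speech back (avoids an echo effect).
            forwarded[receiver] = [f for ue, f in speech.items() if ue != receiver]
        else:
            # No speech anywhere: forward a single SID frame from another UE.
            others = [f for ue, f in frames.items() if ue != receiver]
            forwarded[receiver] = others[:1]
    return forwarded
```

For example, with one speaker and two silent participants, each silent participant receives only the speaker's frame and the speaker receives nothing.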

It is within the contemplation of the invention that the functions required for performing the conference call can be implemented anywhere suitable, such as in the UEs, the network or the conference application, or can be distributed between these.

The determination of whether speech is originating at the transmitting UE may be made in the transmitting UE or alternatively in the base stations, the network or the conference application in response to the speech frames received. For example, the conference application may comprise a speech metric processor which evaluates whether the received frame corresponds to speech or background noise. Similarly the generation of SID frames may be performed in the transmitting UE or alternatively in the base stations, the network or the conference application.

The suppression of the silence descriptor information may also be implemented anywhere in the system such as in the transmitting UE, the network, the conference application or the receiving UE. Specifically, in one embodiment the determination if speech is originating at the transmitting UE and the generation of SID frames is performed in the transmitting UE and the suppression of the silence descriptor information is performed in a conference application, which also inherently determines that the current call is a conference call as it forms the conference bridge. This embodiment has the advantage of being possible to implement centrally without modification to UEs already operable to use silence descriptor frames.

Alternatively, the transmitting UE may determine that the call is a conference call and set a flag in the SID frame to indicate this. The silence descriptor information may be automatically suppressed (for example by modifying or ignoring the SID frame) anywhere in the system upon detection that this flag is set. This functionality could advantageously be implemented in the network including the base stations.
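The flag-based variant can be sketched in Python as follows. The `SidFrame` type, its field names and the all-zero representation of silence are hypothetical; the description does not define where in the SID frame the flag is carried.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SidFrame:
    noise_params: Tuple[int, ...]   # silence descriptor information
    conference: bool = False        # hypothetical flag set by the transmitting UE


def suppress_if_flagged(frame):
    """Return the frame with its descriptor information zeroed if the flag is set.

    Any element on the path (base station, network, receiving UE) could apply
    this check; frames without the flag pass through unchanged.
    """
    if frame.conference:
        return replace(frame, noise_params=(0,) * len(frame.noise_params))
    return frame
```

Carrying the decision in the frame itself means the suppressing element does not need its own means of determining that the call is a conference call.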

It is within the contemplation of the invention that any suitable form of determining if speech is originating at a transmitting UE may be used. In a very simple embodiment, it is simply detected whether any signal is picked up by the microphone of the transmitting UE. If the signal level is below a threshold it is determined that no speech is originating and if above this threshold it is determined that speech is originating.
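The threshold test of this simple embodiment can be sketched in Python. The use of the root-mean-square value as the measure of signal level is an assumption, as are the function name `is_speech` and its parameters; the description only requires comparison of a signal level with a threshold.

```python
import math


def is_speech(samples, threshold):
    """Return True if the RMS level of `samples` exceeds `threshold`.

    `samples` is a sequence of numeric microphone samples for one segment.
    An empty segment is treated as containing no speech.
    """
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold
```

Such a detector cannot distinguish loud background noise from speech, which is why the preferred embodiment compares signal characteristics with those typical of speech instead.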

The components and functionality described may be implemented in any suitable manner to provide suitable apparatus. Specifically, the components may consist of a single discrete entity, or may alternatively be formed by adapting existing parts or components. As such the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media. Furthermore, the functionality may be implemented in the form of hardware, firmware, software, or any combination of these.

It will be understood that the invention tends to provide the following advantages singly or in any combination:

- resulting audio noise is significantly reduced for conference calls.
- normal operation of the UE is not affected as the modified behaviour only operates during a conference call. This is a significant advantage as the absence of comfort noise in a normal conversation is very unpleasant to users whereas it actually provides improved quality for a conference call.
- reduction in the utilisation of the communication channel resources between the UE and the network arising from suppression of SID frame transmission, which may also reduce the cost to the end user and/or network provider.

1. A method of performing a conference call in a communication system employing discontinuous transmission and having a plurality of user equipment and a fixed network, the method comprising the steps of: determining if speech is originating at a first user equipment of said plurality of user equipment; generating silence descriptor frames if no speech is originating at the first user equipment; determining if a call involving the first user equipment is a conference call; and suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.
2. A method as claimed in claim 1 wherein the step of determining if speech is originating in the first user equipment and the step of generating silence descriptor frames are performed in the first user equipment.
3. A method as claimed in claim 1 wherein the step of suppressing the silence descriptor information is performed in the user equipment.
4. A method as claimed in claim 1 wherein the step of suppressing the silence descriptor information is performed in the network.
5. A method as claimed in claim 1 wherein the step of suppressing the silence descriptor information is performed in a conference application.
6. A method as claimed in claim 1 wherein the step of determining if a call involving the first user equipment is a conference call is performed in the user equipment.
7. A method as claimed in claim 1 wherein the step of determining if a call involving the first user equipment is a conference call is performed in the network.
8. A method as claimed in claim 1 wherein the step of determining if a call involving the first user equipment is a conference call is performed in a conference application.
9. A method as claimed in claim 1 wherein suppression of the silence descriptor information is by setting the silence descriptor information of the silence descriptor substantially similar to that indicating silence.
10. A method as claimed in claim 1 wherein the step of determining if speech is originating at the first user equipment is performed by comparing the characteristics of a signal originating at the first user equipment to characteristics typical of a speech signal.
11. A method of performing a conference call as claimed in claim 1 wherein the communication system is a cellular communication system wherein the user equipment communicates through radio signals with at least one base station of the network.
12. A communication system employing discontinuous transmission and having a plurality of user equipment and a fixed network, the communication system comprising: means for determining if speech is originating at a first user equipment of said plurality of user equipment; means for generating silence descriptor frames if no speech is originating at the first user equipment; means for determining if a call involving the first user equipment is a conference call; and means for suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.
13. A communication system as claimed in claim 12 wherein the communication system is a cellular radio mobile communication system.
14. A user equipment employing discontinuous transmission and comprising: means for determining if speech is originating at the user equipment; means for generating silence descriptor frames if no speech is originating at the user equipment; means for determining if a call involving the first user equipment is a conference call; and means for suppressing silence descriptor information of the silence descriptor frames if the call is a conference call.